Does syntax help discourse segmentation? Not so much
نویسندگان
چکیده
Discourse segmentation is the first step in building discourse parsers. Most work on discourse segmentation does not scale to real-world discourse parsing across languages, for two reasons: (i) models rely on constituent trees, and (ii) experiments have relied on gold standard identification of sentence and token boundaries. We therefore investigate to what extent constituents can be replaced with universal dependencies, or left out completely, as well as how state-of-the-art segmenters fare in the absence of sentence boundaries. Our results show that dependency information is less useful than expected, but we provide a fully scalable, robust model that only relies on part-of-speech information, and show that it performs well across languages in the absence of any gold-standard annotation.
منابع مشابه
Distinguishing Questions by Contour Speech Recognition Tasks
It is generally acknowledged today that, while the intonational features speakers select when they utter a sentence are not determined by the syntax, semantics or discourse context of that sentence, knowledge of these factors can help to constrain the possible intonational features speakers are likely to choose. So, while intonational variation poses a challenge to speech recognition in one sen...
متن کاملSentential Structure And Discourse Parsing
In this paper, we describe how the LIDAS System (Linguistic Discourse Analysis System), a discourse parser built as an implementation of the Unified Linguistic Discourse Model (U-LDM) uses information from sentential syntax and semantics along with lexical semantic information to build the Open Right Discourse Parse Tree (DPT) that serves as a representation of the structure of the discourse (P...
متن کاملA Study of using Syntactic and Semantic Structures for Concept Segmentation and Labeling
This paper presents an empirical study on using syntactic and semantic information for Concept Segmentation and Labeling (CSL), a well-known component in spoken language understanding. Our approach is based on reranking N -best outputs from a state-of-the-art CSL parser. We perform extensive experimentation by comparing different tree-based kernels with a variety of representations of the avail...
متن کاملFirst steps towards an ISO standard for annotating discourse relations
This paper describes initial studies in the context of a new effort within ISO to design an international standard for the annotation of discourse with semantic relations that are important for its coherence, “discourse relations”. This effort takes the Penn Discourse Treebank (PDTB) as its starting point, and applies a methodology for defining semantic annotation languages which distinguishes ...
متن کاملEFL Classroom Discourse in Iranian Context: Investigating Teacher Talk Adaptation to Students’ Proficiency Level
How language teachers talk is a key factor in organizing and facilitating learning specifically in language classrooms where the medium of instruction is also the subject matter. This study aimed to examine the extent and ways of teacher talk adaptation to students’ proficiency levels in the Iranian EFL context. Two EFL teachers who were teaching three different proficiency levels were observed...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2017